bootloader: Fix a few things related to systemd-boot installs#2191
bootloader: Fix a few things related to systemd-boot installs#2191cgwalters wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the MountedImageRoot abstraction to manage the mounting of composefs images and the EFI System Partition (ESP) within a temporary directory. This change allows bootctl to be invoked with the --root flag, enabling it to read configuration from the target OS. Feedback identifies a potential issue where the KERNEL_INSTALL_CONF_ROOT environment variable might resolve to the host's root instead of the temporary mount; a suggestion is provided to use the host-absolute path to ensure the kernel entry token is written to the correct location.
| .env("SYSTEMD_RELAX_ESP_CHECKS", "1") | ||
| // Redirect the entry-token write into the ESP (writable FAT) rather | ||
| // than the EROFS composefs root. | ||
| .env("KERNEL_INSTALL_CONF_ROOT", &esp_path_in_root) |
There was a problem hiding this comment.
The KERNEL_INSTALL_CONF_ROOT environment variable is used by bootctl to determine where to read and write the kernel entry token. When bootctl is invoked with --root, it prefixes paths provided via CLI arguments, but it typically does not prefix absolute paths provided via environment variables.
Since esp_path_in_root is an absolute path (e.g., /efi), bootctl will likely attempt to access the host's /efi directory instead of the ESP mounted within the temporary root. To ensure the entry token is correctly redirected to the writable ESP, you should provide the absolute path on the host where the ESP is currently mounted.
| .env("KERNEL_INSTALL_CONF_ROOT", &esp_path_in_root) | |
| .env("KERNEL_INSTALL_CONF_ROOT", prepared_root.root_path().join(prepared_root.esp_subdir)) |
|
/packit test |
|
/packit build |
There was a problem hiding this comment.
One nitpick, one thing I think should be changed.
We used the following redirect during our testing and that made things work, it's largely similar to the new arguments passed here:
Note that we had to set the --entry-token as it couldn't find one in /etc/kernel/entry-token and couldn't find /etc/os-release. The behavior is as follows:
If set to auto (the default), the /etc/kernel/entry-token file will be read if it exists, and the stored value used. Otherwise, if the
local machine ID is initialized it is used. Otherwise, IMAGE_ID= from os-release will be used, if set. Otherwise, ID= from os-release
will be used, if s
#!/usr/bin/env python3
import subprocess as sp
import sys
def main():
args = sys.argv[1:]
for idx, arg in enumerate(args):
if arg == "--esp-path":
args[idx+1] = "/boot/efi"
args.extend(["--root", "/run/osbuild/mounts"])
args.extend(["--entry-token", "literal:simonos"])
cmd = ["/usr/bin/bootctl.orig"] + args
print(" ".join(cmd), file=sys.stderr)
sp.run(cmd, check=True)
if __name__ == "__main__":
main()
But you're setting |
That makes sense 🙂. |
fa98cd1 to
8af07fe
Compare
|
I can confirm that with the RPM from this PR installed in the container we get past the bootloader installation. |
| // Per BLS: if only an ESP is present (no XBOOTLDR), mount it to /boot. | ||
| // If both ESP and XBOOTLDR are present, mount the ESP to /efi instead. | ||
| // /boot/efi is explicitly discouraged by the spec. | ||
| let esp_subdir = if composefs |
There was a problem hiding this comment.
Here I think we're checking for existence of "efi" dir in the EROFS. Shouldn't we be iterating through disk partitions to find/not-find XBOOTLDR?
There was a problem hiding this comment.
We were already using the partitions to find the source, this is about the mount target, and there's a bigger picture issue, Fedora derivatives are missing /efi by default.
There was a problem hiding this comment.
(bit of a sidenote but it might be a good idea to move to using /boot + /efi in Fedora, I'll discuss with a few people if it's worth a change proposal. Especially now that we have appropriate RPM macros).
See: osbuild/image-builder-cli#506 Right now most of our testing for composefs installs uses bcvk which uses `to-disk` in an ephemeral VM. That masks some issues, like the fact that `bootctl install` by default touches the ESP variables. Wire things up with `--generic-image` so it behaves the same as the bootupd backend. Now, the next problem is that osbuild actually makes `/` readonly (which is great!) and in fact we should change our recommended `podman run` invocations to pass `--read-only`. But that reveals the next bug, which is that bootctl installs wants to write an `/etc/kernel/entry-token` which we basically don't use. Redirect those writes to a tempdir for now, though we should probably try to get upstream patches for that. Also set SYSTEMD_RELAX_ESP_CHECKS=1 so bootctl skips the partition-type GUID check for the ESP - we don't want it to e.g. try to look up things in the udev database, which might not be available. Assisted-by: OpenCode (claude-sonnet-4-6) Signed-off-by: Colin Walters <walters@verbum.org>
|
ok yeah a definite pain point here is our rpm artifacts expire, so one can't retry just the flaky jobs after a while. Also I think what we want here is to change the logic so that The problem is here if we opt in to more ci with ci/merge label, it then blocks adding to the merge queue until those jobs pass. Whereas we'd get automatic retries in the merge queue. |
Yeah this is definitely annoying. I burned some tokens recently to try and get it to untangle things such that if only one subset of the jobs flakes (say everything passes except fedora-44-x86_64) it would only need to re-build the fedora-44-x86_64 package in order to re-run the fedora-44-x86_64 job. Because currently it has to rebuild all of the packages which then in turn triggers all of the jobs to re-run. But anyway my token burning didn't yield any immediate obvious way to decouple it but maybe throwing more compute at the problem, be it GPU or human brain power, could improve this substantially. |
|
I previously generated https://github.com/cgwalters/merge-queue-like-bors though I'm a bit wary to deploy it on this repo. Maybe let's start with bcvk or another one? It should fix this problem though. |
See: osbuild/image-builder-cli#506
Right now most of our testing for composefs installs uses bcvk which uses
to-diskin an ephemeral VM. That masks some issues, like the fact thatbootctl installby default touches the ESP variables. Wire things up with--generic-imageso it behaves the same as the bootupd backend.Now, the next problem is that osbuild actually makes
/readonly (which is great!) and in fact we should change our recommendedpodman runinvocations to pass--read-only. But that reveals the next bug, which is that bootctl installs wants to write an/etc/kernel/entry-tokenwhich we basically don't use. Redirect those writes to a tempdir for now, though we should probably try to get upstream patches for that.Also set SYSTEMD_RELAX_ESP_CHECKS=1 so bootctl skips the partition-type GUID check for the ESP - we don't want it to e.g. try to look up things in the udev database, which might not be available.
Assisted-by: OpenCode (claude-sonnet-4-6)